Pattern Matching in Text Compressed by Using Antidictionaries Yusuke

نویسندگان

  • Yusuke Shibata
  • Masayuki Takeda
  • Ayumi Shinohara
  • Setsuo Arikawa
  • S. Arikawa
چکیده

In this paper we focus on the problem of compressed pattern matching for the text compression using antidictionaries, which is a new compression scheme proposed recently by Crochemore et al. (1998). We show an algorithm which preprocesses a pattern of length m and an antidictionary M in O(m 2 + kMk) time, and then scans a compressed text of length n in O(n+ r) time to nd all pattern occurrences, where kMk is the total length of strings in M and r is the number of the pattern occurrences.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Pattern Matching in Text Compressed by Using Antidictionaries

In this paper we focus on the problem of compressed pattern matching for the text compression using antidictionaries, which is a new compression scheme proposed recently by Crochemore et al. (1998). We show an algorithm which preprocesses a pattern of length m and an antidictionary M in O(m 2 + kMk) time, and then scans a compressed text of length n in O(n+ r) time to nd all pattern occurrences...

متن کامل

Pattern Matching in DCA Coded Text

A new algorithm searching all occurrences of a regular expression pattern in a text is presented. It uses only the text that has been compressed by the text compression using antidictionaries without its decompression. The proposed algorithm runs inO(2 ·||AD||+nc+r) worst case time, where m is the length of the pattern, AD is the antidictionary, nC is the length of the coded text and r is the n...

متن کامل

A Unifying Framework for Compressed Pattern Matching

We introduce a general framework which is suitable to capture an essence of compressed pattern matching according to various dictionary based compressions. The goal is to find all occurrences of a pattern in a text without decompression, which is one of the most active topics in string matching. Our framework includes such compression methods as Lempel-Ziv family, (LZ77, LZSS, LZ78, LZW), byte-...

متن کامل

Speeding Up Pattern Matching by Text Compression

Byte pair encoding (BPE) is a simple universal text compression scheme. Decompression is very fast and requires small work space. Moreover, it is easy to decompress an arbitrary part of the original text. However, it has not been so popular since the compression is rather slow and the compression ratio is not as good as other methods such as Lempel-Ziv type compression. In this paper, we bring ...

متن کامل

A Boyer-Moore Type Algorithm for Compressed Pattern Matching

We apply the Boyer–Moore technique to compressed pattern matching for text string described in terms of collage system, which is a formal framework that captures various dictionary-based compression methods. For a subclass of collage systems that contain no truncation, our new algorithm runs in O(‖D‖ + n · m + m + r) time using O(‖D‖ + m) space, where ‖D‖ is the size of dictionary D, n is the c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999